In this paper, we develop statistical inference techniques for the unknown coefficient functions 
and single-index parameters in single-index varying-coefficient models. We first estimate the 
nonparametric component via local linear fitting, then construct an estimated empirical 
likelihood ratio function and hence obtain a maximum empirical likelihood estimator for the 
parametric component. Our estimator for the parametric component is asymptotically efficient, and 
the estimator of the nonparametric component has an optimal convergence rate. Our results provide 
ways to construct confidence regions for the unknown parameter. We also develop 
an adjusted empirical likelihood ratio for constructing confidence regions for the parameters of 
interest. A simulation study is conducted to evaluate the finite-sample behavior of the proposed 
methods. 
1. Introduction 

Consider a single-index varying-coefficient model of the form 

$$Y = g_0^T(\beta_0^T X) Z + \varepsilon, \qquad (1.1)$$

where $(X, Z) \in R^p \times R^q$ is a vector of covariates, $Y$ is the response variable, $\beta_0$ is a 
$p \times 1$ vector of unknown parameters, $g_0(\cdot)$ is a $q \times 1$ vector of unknown functions and $\varepsilon$ 
is a random error with mean 0 and finite variance $\sigma^2$. Assume that $\varepsilon$ and $(X, Z)$ are 
independent. For the sake of identifiability, it is often assumed that $\|\beta_0\| = 1$ and the 
first non-zero element is positive, where $\|\cdot\|$ denotes the Euclidean metric. 
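
The identifiability constraint can be illustrated with a short sketch. The function name `normalize_index` and the tolerance `tol` are hypothetical choices made for this illustration; the code simply rescales a direction vector to unit Euclidean norm and flips its sign so that the first non-zero entry is positive.

```python
import math


def normalize_index(beta, tol=1e-12):
    """Scale beta to unit Euclidean norm with its first non-zero entry positive.

    Hypothetical helper: it only illustrates the identifiability
    convention ||beta|| = 1 with a positive first non-zero element.
    """
    norm = math.sqrt(sum(b * b for b in beta))
    if norm <= tol:
        raise ValueError("beta must be a non-zero vector")
    unit = [b / norm for b in beta]
    # Locate the first entry that is not numerically zero.
    first = next(b for b in unit if abs(b) > tol)
    sign = 1.0 if first > 0 else -1.0
    return [sign * b for b in unit]


if __name__ == "__main__":
    # (0, -3, 4) and (0, 3, -4) describe the same index direction.
    print(normalize_index([0.0, -3.0, 4.0]))
```

Two vectors that differ only by a non-zero scalar factor are mapped to the same representative, which is what makes the index parameter identifiable.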

Model (1.1) includes a class of important statistical models. For example, if $q = 1$ and 
$Z = 1$, (1.1) reduces to the single-index model (see, e.g., Härdle, Hall and Ichimura [11], 
Weisberg and Welsh [24], Zhu and Fang [33], Chiou and Müller [6], Hristache, Juditsky 
and Spokoiny [13], Xue and Zhu [31]). If $p = 1$ and $\beta_0 = 1$, (1.1) is the varying-coefficient 
model (see, e.g., Chen and Tsay [5], Hastie and Tibshirani [12], Wu, Chiang 
and Hoover [25], Fan and Zhang [10], Cai, Fan and Li [2], Cai, Fan and Yao [3], Xue 
and Zhu [29]). If the last component of $\beta_0$ is non-zero and $Z = (1, X^{*T})^T$, where $X^*$ 
is the remaining vector of $X$ with its $p$th component deleted, (1.1) becomes the adaptive 
varying-coefficient linear model (see, e.g., Fan, Yao and Cai [9], Lu, Tjøstheim and 
Yao [15]). 

Model (1.1) is easily interpreted in real applications because it has the features of both 
the single-index model and the varying-coefficient model. In addition, model (1.1) may 
include cross-product terms of some components of $X$ and $Z$. Hence it has considerable 
flexibility to cater for complex multivariate nonlinear structure. Xia and Li [26] investigated 
a class of single-index coefficient regression models, which include model (1.1) as 
a special example. For the case in which the model is used as a nonparametric time series model, 
Xia and Li [26] obtained the estimator of $g_0(\cdot)$ by kernel smoothing, then derived the estimator 
of $\beta_0$ by the least squares method, and proved that the corresponding estimators are consistent 
and asymptotically normal. 

In this paper, we develop statistical inference techniques for $g_0(\cdot)$ and $\beta_0$ with independent 
observations of $(Y, X, Z)$. We can construct an empirical likelihood ratio function 
for $\beta_0$ by assuming $g_0(\cdot)$ and its derivative $\dot g_0(\cdot)$ to be known functions. In practice, however, 
they are unknown, and hence the empirical likelihood ratio function cannot be used to 
make inference on $\beta_0$. This motivates us to estimate the unknown $g_0(\cdot)$ and $\dot g_0(\cdot)$ via 
the local linear smoother, and then obtain an estimated empirical likelihood ratio for $\beta_0$. 
The estimated empirical log-likelihood ratio is asymptotically distributed as a weighted 
sum of independent $\chi_1^2$ variables with unknown weights. This result cannot be applied 
directly to construct a confidence region for $\beta_0$. To solve this issue, two methods may be 
used (see Wang and Rao [22]). The first method is to estimate the unknown weights 
consistently so that the distribution of the estimated weighted sum of chi-squared variables 
can be estimated from the data. The second method is to adjust the estimated 
empirical log-likelihood ratio so that the resulting adjusted empirical log-likelihood ratio 
is asymptotically chi-squared. Also, we obtain a maximum empirical likelihood estimator 
of $\beta_0$ by maximizing the estimated empirical likelihood ratio function, and investigate its 
asymptotic properties. In addition, we obtain the convergence rate of the estimator of $\sigma^2$ 
and define a consistent estimator of the asymptotic variance; this allows us to construct 
a confidence region for $\beta_0$. 

Compared with the existing methods, our estimating method has the following advantage: 
the asymptotic variance of our estimator for $\beta_0$ is the same as those of Härdle 
et al. [11] and Xia and Li [26] when the model reduces to the single-index model; this 
shows that our estimator for $\beta_0$ is as efficient as those of Härdle et al. [11] 
and Xia and Li [26]. The difference between the proposed estimating approach and 
the existing estimating approaches is that we use an empirical likelihood ratio to define 
the estimator of $\beta_0$, while the existing work uses least squares techniques (see, e.g., 
Härdle et al. [11], Xia and Li [26]). Also, we develop an empirical likelihood inference 
for constructing a confidence region for $\beta_0$. The empirical likelihood method, introduced 
by Owen [17], has many advantages for constructing confidence regions or intervals. For 
example, it does not impose prior constraints on region shape, and it does not require the 
construction of a pivotal quantity. The empirical likelihood has been studied by many 
authors. Related works include Wang and Rao [22], Wang, Linton and Härdle [21], Xue 
and Zhu [29-31], Zhu and Xue [32], Qin and Zhang [18], Stute, Xue and Zhu [20], Xue 
[27, 28], Wang and Xue [23], among others. 

The rest of the paper is organized as follows. In Section 2, we define an estimated 
empirical likelihood ratio, and then obtain a maximum empirical likelihood estimator 
of $\beta_0$ by maximizing the empirical likelihood ratio function; the asymptotic properties of 
the proposed estimators are also investigated. In Section 3, we define an adjusted empirical 
log-likelihood and derive its asymptotic distribution. Section 4 reports a simulation 
study. Proofs of theorems are relegated to the Appendix. It should be pointed out that some 
special techniques are used in the proofs. 

2. Estimated empirical likelihood 
2.1. Methodology 

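Suppose that $\{(Y_i, X_i, Z_i); 1 \le i \le n\}$ is an independent and identically distributed (i.i.d.) 
sample from (1.1), that is, 

$$Y_i = g_0^T(\beta_0^T X_i) Z_i + \varepsilon_i, \qquad i = 1, \ldots, n,$$

where the $\varepsilon_i$'s are i.i.d. random errors with mean 0 and finite variance $\sigma^2$. Assume that 
$\{\varepsilon_i; 1 \le i \le n\}$ are independent of $\{(X_i, Z_i); 1 \le i \le n\}$. 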

To construct an empirical likelihood ratio function for $\beta_0$, we introduce an auxiliary 
random vector 

$$\eta_i(\beta) = \{Y_i - g_0^T(\beta^T X_i) Z_i\}\, \dot g_0^T(\beta^T X_i) Z_i X_i\, w(\beta^T X_i), \qquad (2.1)$$

where $\dot g_0(\cdot)$ stands for the derivative of the function vector $g_0(\cdot)$, and $w(\cdot)$ is a bounded 
weight function with a bounded support $\mathcal{U}_w$, which is introduced to control the boundary 
effect in the estimation of $g_0(\cdot)$ and $\dot g_0(\cdot)$. For convenience, we take $w(\cdot)$ to be the 
indicator function of the set $\mathcal{U}_w$. Note that $E\{\eta_i(\beta)\} = 0$ if $\beta = \beta_0$. Hence, the problem 
of testing whether $\beta$ is the true parameter is equivalent to testing whether $E\{\eta_i(\beta)\} = 0$ 
for $i = 1, 2, \ldots, n$. By Owen [17], this can be done by using the empirical likelihood. That 
is, we can define the profile empirical likelihood ratio function 

$$L_n(\beta) = \max\Bigg\{\prod_{i=1}^n (n p_i): p_i \ge 0,\ \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i \eta_i(\beta) = 0\Bigg\}.$$

It can be shown that $-2\log L_n(\beta_0)$ is asymptotically chi-squared with $p$ degrees of freedom. 
However, $L_n(\beta)$ cannot be directly used to make statistical inference on $\beta_0$ because 
$L_n(\beta)$ contains the unknowns $g_0(\cdot)$ and $\dot g_0(\cdot)$. A natural way is to replace $g_0(\cdot)$ and $\dot g_0(\cdot)$ 
in $L_n(\beta)$ by their estimators and define an estimated empirical likelihood function. In 
this paper, we estimate the vector functions $g_0(\cdot)$ and $\dot g_0(\cdot)$ via the local linear regression 
technique (see, e.g., Fan and Gijbels [8]). The local linear estimators for $g_0(u)$ and $\dot g_0(u)$ 
are defined as $\hat g(u; \beta_0) = \hat a$ and $\hat{\dot g}(u; \beta_0) = \hat b$ at the fixed point $\beta_0$, where $\hat a$ and $\hat b$ minimize 
the sum of weighted squares 

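$$\sum_{i=1}^n \big[Y_i - \{a + b(\beta_0^T X_i - u)\}^T Z_i\big]^2 K_h(\beta_0^T X_i - u),$$

where $K_h(\cdot) = h^{-1}K(\cdot/h)$, $K(\cdot)$ is a kernel function, and $h = h_n$ is a bandwidth sequence 
that decreases to 0 as $n$ increases to $\infty$. It follows from least squares theory that 

$$\big(\hat g^T(u; \beta_0),\ h\hat{\dot g}^T(u; \beta_0)\big)^T = S_n^{-1}(u; \beta_0)\,\xi_n(u; \beta_0),$$

where 

$$S_n(u; \beta_0) = \begin{pmatrix} S_{n,0}(u; \beta_0) & S_{n,1}(u; \beta_0) \\ S_{n,1}(u; \beta_0) & S_{n,2}(u; \beta_0) \end{pmatrix} \quad\text{and}\quad \xi_n(u; \beta_0) = \begin{pmatrix} \xi_{n,0}(u; \beta_0) \\ \xi_{n,1}(u; \beta_0) \end{pmatrix}$$

with 

$$S_{n,j}(u; \beta_0) = \frac{1}{n}\sum_{i=1}^n Z_i Z_i^T \Big(\frac{\beta_0^T X_i - u}{h}\Big)^j K_h(\beta_0^T X_i - u)$$

and 

$$\xi_{n,j}(u; \beta_0) = \frac{1}{n}\sum_{i=1}^n Z_i Y_i \Big(\frac{\beta_0^T X_i - u}{h}\Big)^j K_h(\beta_0^T X_i - u).$$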
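
The closed-form solution above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the names `epanechnikov` and `local_linear` are hypothetical, the Epanechnikov kernel is one admissible choice of $K$, and the direction `beta` is treated as given.

```python
import numpy as np


def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 * (1 - t^2)_+ (an assumed choice of K)."""
    return 0.75 * np.clip(1.0 - t**2, 0.0, None)


def local_linear(u, beta, X, Z, Y, h):
    """Local linear estimates of g(u) and its derivative for a fixed beta.

    X: (n, p) covariates, Z: (n, q) covariates, Y: (n,) responses.
    Returns (g_hat, gdot_hat), each of length q.
    """
    t = (X @ beta - u) / h              # (beta^T X_i - u) / h
    k = epanechnikov(t) / h             # K_h(beta^T X_i - u)
    n, q = Z.shape
    # Blocks S_{n,j} (j = 0, 1, 2) and xi_{n,j} (j = 0, 1).
    S = [(Z * (k * t**j)[:, None]).T @ Z / n for j in range(3)]
    xi = [Z.T @ (k * t**j * Y) / n for j in range(2)]
    S_full = np.block([[S[0], S[1]], [S[1], S[2]]])
    sol = np.linalg.solve(S_full, np.concatenate(xi))
    # The solution stacks g_hat and h * gdot_hat.
    return sol[:q], sol[q:] / h
```

When the coefficient functions are exactly linear and the data are noise-free, the local linear fit reproduces both the function values and the slopes, which gives a convenient sanity check for an implementation.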



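Since the convergence rate of the estimator of $\dot g_0(u)$ is slower than that of the estimator 
of $g_0(u)$ if the same bandwidth is used, this leads to a slower convergence rate than $\sqrt{n}$ for the 
estimator $\hat\beta$ of $\beta_0$. To increase the convergence rate of the estimator of $\dot g_0(u)$, 
we introduce another bandwidth $h_1$ to replace $h$ in $\hat{\dot g}(u; \beta)$, and denote the result by $\hat{\dot g}_{h_1}(u; \beta)$. 

Let $\hat\eta_i(\beta)$ be $\eta_i(\beta)$, with $g_0(\beta^T X_i)$ and $\dot g_0(\beta^T X_i)$ replaced by $\hat g(\beta^T X_i; \beta)$ and 
$\hat{\dot g}_{h_1}(\beta^T X_i; \beta)$, respectively, for $i = 1, \ldots, n$. Then an estimated empirical likelihood ratio 
function is defined by 

$$\hat L(\beta) = \max\Bigg\{\prod_{i=1}^n (n p_i): p_i \ge 0,\ \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i \hat\eta_i(\beta) = 0\Bigg\}.$$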

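By the Lagrange multiplier method, $\log \hat L(\beta)$ can be represented as 

$$\log \hat L(\beta) = -\sum_{i=1}^n \log\big(1 + \lambda^T \hat\eta_i(\beta)\big), \qquad (2.2)$$

where $\lambda$ is determined by 

$$\frac{1}{n}\sum_{i=1}^n \frac{\hat\eta_i(\beta)}{1 + \lambda^T \hat\eta_i(\beta)} = 0. \qquad (2.3)$$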
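
As an illustration of (2.2) and (2.3), the following sketch computes the Lagrange multiplier by a damped Newton iteration and returns the statistic $-2\log\hat L(\beta)$. It is not the authors' implementation: the function name `el_log_ratio`, the iteration cap and the tolerance are assumptions, and the code presumes that the zero vector lies inside the convex hull of the auxiliary vectors so that a finite solution exists.

```python
import numpy as np


def el_log_ratio(eta, n_iter=50, tol=1e-10):
    """Return (-2 log L, lam) for auxiliary vectors eta of shape (n, p).

    lam solves sum_i eta_i / (1 + lam^T eta_i) = 0, i.e. it minimises the
    convex function f(lam) = -sum_i log(1 + lam^T eta_i).
    """
    n, p = eta.shape
    lam = np.zeros(p)

    def objective(l):
        w = 1.0 + eta @ l
        # Keep the implied weights p_i = 1 / (n * w_i) inside (0, 1].
        return np.inf if np.any(w <= 1.0 / n) else -np.sum(np.log(w))

    for _ in range(n_iter):
        w = 1.0 + eta @ lam
        scaled = eta / w[:, None]
        grad = -scaled.sum(axis=0)
        if np.linalg.norm(grad) / n < tol:
            break
        hess = scaled.T @ scaled
        step = -np.linalg.solve(hess, grad)
        t = 1.0
        # Step halving: stay in the domain and do not increase the objective.
        while objective(lam + t * step) > objective(lam) and t > 1e-12:
            t *= 0.5
        lam = lam + t * step
    return 2.0 * np.sum(np.log(1.0 + eta @ lam)), lam
```

In practice `eta` would hold the estimated vectors $\hat\eta_i(\beta)$ as rows; if their sample mean is exactly zero, the multiplier is zero and the statistic vanishes.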
Let $B = \{\beta \in R^p: \|\beta\| = 1$, and the first non-zero element is positive$\}$. Then $\beta_0$ is an 
inner point of the set $B$. Therefore we need only search for $\beta_0$ over $B$. A maximum 
empirical likelihood estimator for $\beta_0$ is given by 

$$\hat\beta = \arg\sup_{\beta \in B} \hat L(\beta). \qquad (2.4)$$

With $\hat\beta$, we define the estimate of $g_0(u)$ by $\hat g(u) = \hat g(u; \hat\beta)$, and the estimate of $\sigma^2$ by 

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n \{Y_i - \hat g^T(\hat\beta^T X_i; \hat\beta) Z_i\}^2. \qquad (2.5)$$

It is well known that if $\beta_0$ is known, the optimal bandwidth $h$ for $\hat g(u)$ is of order 
$O(n^{-1/5})$. However, if $\beta_0$ is unknown, in order to ensure that the estimator $\hat\beta$ is root-n 
consistent, the bandwidth $h$ should be smaller than $O(n^{-1/5})$ if we only assume that the components 
of $g_0(\cdot)$ are twice differentiable (see Theorem 2 below). Note that once the estimator $\hat\beta$ 
is available, an optimal bandwidth of order $O(n^{-1/5})$ can be used in the final estimator 
for $g_0(\cdot)$. 

2.2. Asymptotic properties 

In order to obtain the asymptotic behaviors of our estimators, we first give the following 
conditions: 

(C1) The density function of $\beta^T X$, $f(u)$, is bounded away from zero for $u \in \mathcal{U}_w$ and $\beta$ 
near $\beta_0$, and satisfies the Lipschitz condition of order 1 on $\mathcal{U}_w$, where $\mathcal{U}_w$ is the 
support of $w(u)$. 

(C2) The functions $g_{0j}(u)$, $1 \le j \le q$, have continuous second derivatives on $\mathcal{U}_w$, where 
$g_{0j}(u)$ are the $j$th components of $g_0(u)$. 

(C3) $E(\|X\|^6) < \infty$, $E(\|Z\|^6) < \infty$ and $E(|\varepsilon|^6) < \infty$. 

(C4) $nh^2/\log^2 n \to \infty$, $nh^4 \log n \to 0$; $nhh_1^3/\log^2 n \to \infty$, $nh_1^5 = O(1)$. 

(C5) The kernel $K(\cdot)$ is a symmetric probability density function with a bounded 
support and satisfies the Lipschitz condition of order 1 and $\int u^2 K(u)\,du \ne 0$. 

(C6) The matrix $D(u) = E(ZZ^T \mid \beta_0^T X = u)$ is positive definite, and each entry of $D(u)$ 
and $C(u) = E(VZ^T \mid \beta_0^T X = u)$ satisfies the Lipschitz condition of order 1 on $\mathcal{U}_w$, 
where $V = X\dot g_0^T(\beta_0^T X) Z w(\beta_0^T X)$, and $\mathcal{U}_w$ is defined in (C1). 

(C7) The matrices $B(\beta_0) = E(VV^T)$ and $B_*(\beta_0) = B(\beta_0) - E\{C(\beta_0^T X)\dot g_0(\beta_0^T X) E(X^T \mid \beta_0^T X)\}$ 
are positive definite, where $V$ is defined in (C6). 

Remark 1. Condition (C1) is used to bound the density function of $\beta^T X$ away from 
zero. This ensures that the denominators of $\hat g(u; \beta)$ and $\hat{\dot g}(u; \beta)$ are, with probability one, 
bounded away from 0 for $u \in \mathcal{U}_w$. The second derivatives in (C2) are standard smoothness 
conditions. (C3)-(C5) are necessary conditions for the asymptotic normality or the uniform 
consistency of the estimators. Conditions (C6) and (C7) ensure that the asymptotic 
variance for the estimator of $\beta_0$ exists. 

Let $B_n = \{\beta \in B: \|\beta - \beta_0\| \le c_0 n^{-1/2}\}$ for some positive constant $c_0$. This is motivated 
by the fact that, since we anticipate that $\hat\beta$ is root-n consistent, we should look for 
a maximum of $\hat L(\beta)$ which involves $\beta$ distant from $\beta_0$ by order $n^{-1/2}$. Similar restrictions 
were also made by Härdle, Hall and Ichimura [11], Xia and Li [26] and Wang and Xue [23]. 

The following theorem shows that $-2\log\hat L(\beta_0)$ is asymptotically distributed as 
a weighted sum of independent $\chi_1^2$ variables. 

Theorem 1. Suppose that conditions (C1)-(C7) hold. Then 

$$-2\log\hat L(\beta_0) \xrightarrow{D} w_1\chi^2_{1,1} + \cdots + w_p\chi^2_{1,p},$$

where $\xrightarrow{D}$ represents convergence in distribution, $\chi^2_{1,1}, \ldots, \chi^2_{1,p}$ are independent $\chi_1^2$ variables 
and the weights $w_j$, for $1 \le j \le p$, are the eigenvalues of $G(\beta_0) = B^{-1}(\beta_0)A(\beta_0)$. 
Here $B(\beta_0)$ is defined in condition (C7), 

$$A(\beta_0) = B(\beta_0) - E\{C(\beta_0^T X) D^{-1}(\beta_0^T X) C^T(\beta_0^T X)\}, \qquad (2.6)$$

and $C(u)$ and $D(u)$ are defined in condition (C6). 

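To apply Theorem 1 to construct a confidence region or interval for $\beta_0$, we need to 
consistently estimate the unknown weights $w_j$. By the "plug-in" method, $A(\beta_0)$ and $B(\beta_0)$ 
can be consistently estimated by 

$$\hat A(\hat\beta) = \frac{1}{n}\sum_{i=1}^n \{\hat V_i \hat V_i^T - \hat C(\hat\beta^T X_i)\hat D^{-1}(\hat\beta^T X_i)\hat C^T(\hat\beta^T X_i)\} \qquad (2.7)$$

and 

$$\hat B(\hat\beta) = \frac{1}{n}\sum_{i=1}^n \hat V_i \hat V_i^T, \qquad (2.8)$$

respectively, where $\hat\beta$ is the maximum empirical likelihood estimator of $\beta_0$ defined 
by (2.4), $\hat V_i = X_i \hat{\dot g}^T(\hat\beta^T X_i; \hat\beta) Z_i w(\hat\beta^T X_i)$, $\hat C(\cdot) = \sum_{i=1}^n W_{ni}(\cdot)\hat V_i Z_i^T$ and 
$\hat D(\cdot) = \sum_{i=1}^n W_{ni}(\cdot) Z_i Z_i^T$ with 

$$W_{ni}(u) = K_1\Big(\frac{u - \hat\beta^T X_i}{b_n}\Big) \Big/ \sum_{k=1}^n K_1\Big(\frac{u - \hat\beta^T X_k}{b_n}\Big),$$

where $K_1(\cdot)$ is a kernel function, and $b_n$ is a bandwidth with $0 < b_n \to 0$. 

This implies that the eigenvalues of $\hat G(\hat\beta) = \hat B^{-1}(\hat\beta)\hat A(\hat\beta)$, say $\hat w_j$, consistently 
estimate $w_j$ for $j = 1, \ldots, p$. Let $\hat c_{1-\alpha}$ be the $1 - \alpha$ quantile of the conditional distribution of 
the weighted sum $\hat s = \hat w_1\chi^2_{1,1} + \cdots + \hat w_p\chi^2_{1,p}$ given the data. Then an approximate $1 - \alpha$ 
confidence region for $\beta_0$ can be defined as 

$$\mathcal{R}_{\mathrm{eel}}(\alpha) = \{\beta \in B: -2\log\hat L(\beta) \le \hat c_{1-\alpha}\}.$$

In practice, the conditional distribution of the weighted sum $\hat s$, given the sample 
$\{(Y_i, X_i, Z_i), 1 \le i \le n\}$, can be calculated using Monte Carlo simulations by repeatedly 
generating independent samples $\chi^2_{1,1}, \ldots, \chi^2_{1,p}$ from the $\chi_1^2$ distribution. 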
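
The Monte Carlo step can be sketched as follows. The function name `weighted_chisq_quantile`, the number of draws and the seed are illustrative assumptions; the weights would in practice be the estimated eigenvalues $\hat w_j$.

```python
import numpy as np


def weighted_chisq_quantile(weights, alpha=0.05, n_draws=100_000, seed=0):
    """Monte Carlo (1 - alpha) quantile of sum_j w_j * chi2_{1,j}.

    Draws independent chi-squared(1) variables, forms the weighted sum for
    each draw, and returns the empirical quantile.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    draws = rng.chisquare(df=1, size=(n_draws, w.size)) @ w
    return float(np.quantile(draws, 1.0 - alpha))
```

A candidate $\beta$ then belongs to the approximate confidence region when its value of $-2\log\hat L(\beta)$ does not exceed the returned quantile. With all weights equal to one the result approximates the usual $\chi_p^2$ quantile, which offers a quick check.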

The following theorem states an interesting result about $\hat\beta$. The asymptotic variance of $\hat\beta$ 
is smaller than that of Härdle et al. [11] when our model reduces to a single-index model. 

Theorem 2. Suppose that conditions (C1)-(C7) hold. Then 

$$\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{D} N\big(0, \sigma^2 B_*^{-1}(\beta_0) A(\beta_0) B_*^{-1}(\beta_0)\big),$$

where $B_*(\beta_0)$ and $A(\beta_0)$ are defined in condition (C7) and (2.6), respectively. 

In model (1.1), if $q = 1$ and $Z = 1$, then (1.1) reduces to the single-index model. By 
Theorem 2, we derive the following result. 

Corollary 1. Suppose that the conditions of Theorem 2 hold. If $q = 1$ and $Z = 1$ in 
model (1.1), then 

$$\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{D} N\big(0, \sigma^2 A_1^{-}(\beta_0)\big),$$

where $A_1(\beta_0) = E[\{X - E(X \mid \beta_0^T X)\}\{X - E(X \mid \beta_0^T X)\}^T \dot g_0^2(\beta_0^T X) w(\beta_0^T X)]$ and $A_1^{-}$ 
represents a generalized inverse of the matrix $A_1$. 

Corollary 1 is the same as the results of Härdle et al. [11] and Xia and Li [26] for the 
single-index model. 

For the estimator of the error variance $\sigma^2$, we have the following result. 

Theorem 3. Suppose that conditions (C1)-(C7) hold. Then 

$$\hat\sigma^2 - \sigma^2 = O_P(n^{-1/2}).$$

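To apply Theorem 2 to the construction of a confidence region for $\beta_0$, we use the estimators 
$\hat\sigma^2$ and $\hat A(\hat\beta)$ defined in (2.5) and (2.7), and define the estimator of $B_*(\beta_0)$ as 
follows: 

$$\hat B_*(\hat\beta) = \frac{1}{n}\sum_{i=1}^n \{\hat V_i \hat V_i^T - \hat C(\hat\beta^T X_i)\hat{\dot g}(\hat\beta^T X_i; \hat\beta)\hat\mu^T(\hat\beta^T X_i)\},$$

where $\hat\mu(\cdot) = \sum_{i=1}^n W_{ni}(\cdot) X_i$ is the estimator of $\mu(u) = E(X \mid \beta_0^T X = u)$. It can be shown 
that $\hat A(\hat\beta) \xrightarrow{P} A(\beta_0)$ and $\hat B_*(\hat\beta) \xrightarrow{P} B_*(\beta_0)$, where $\xrightarrow{P}$ denotes convergence in probability. 
By Theorems 2 and 3, we have 

$$\{\hat\sigma^2 \hat B_*^{-1}(\hat\beta)\hat A(\hat\beta)\hat B_*^{-1}(\hat\beta)\}^{-1/2}\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{D} N(0, I_p).$$

Using Theorem 10.2d in Arnold [1], we obtain 

$$(\hat\beta - \beta_0)^T \{n^{-1}\hat\sigma^2 \hat B_*^{-1}(\hat\beta)\hat A(\hat\beta)\hat B_*^{-1}(\hat\beta)\}^{-}(\hat\beta - \beta_0) \xrightarrow{D} \chi_p^2.$$

Let $\chi_p^2(1 - \alpha)$ be the $1 - \alpha$ quantile of $\chi_p^2$ for $0 < \alpha < 1$. Then 

$$\{\beta: (\hat\beta - \beta)^T \{n^{-1}\hat\sigma^2 \hat B_*^{-1}(\hat\beta)\hat A(\hat\beta)\hat B_*^{-1}(\hat\beta)\}^{-}(\hat\beta - \beta) \le \chi_p^2(1 - \alpha)\}$$

gives an approximate $1 - \alpha$ confidence region for $\beta_0$. 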



3. Adjusted empirical likelihood 

In addition to the above direct way of approximating the asymptotic distributions, we 
can also consider the following alternative. The alternative is motivated by the results of 
Rao and Scott [19]. By Rao and Scott [19], the distribution of $\rho(\beta_0)\sum_{i=1}^p w_i\chi^2_{1,i}$ can be 
approximated by $\chi_p^2$, where $\rho(\beta_0) = p/\mathrm{tr}\{G(\beta_0)\}$. Let $\hat\rho(\beta) = p/\mathrm{tr}\{\hat G(\beta)\}$ with $\hat G(\beta) = 
\hat A^{1/2}(\beta)\hat B^{-1}(\beta)\hat A^{1/2}(\beta)$, where $\hat A(\beta)$ and $\hat B(\beta)$ are defined in (2.7) and (2.8). Invoking 
Theorem 1 and the consistency of $\hat G(\beta)$, the asymptotic distribution of $\hat\rho(\beta)\{-2\log\hat L(\beta)\}$ 
can be approximated by $\chi_p^2$. Clearly, $\beta$ in $\hat\rho(\cdot)$ can be replaced by $\hat\beta$. Therefore, an 
improved Rao-Scott adjusted empirical log-likelihood can be defined as 

$$\tilde l(\beta) = \hat\rho(\hat\beta)\{-2\log\hat L(\beta)\}.$$

However, the accuracy of this approximation still depends on the values of the $w_i$'s. Now, 
we propose another adjusted empirical log-likelihood, whose asymptotic distribution is 
chi-squared with $p$ degrees of freedom. The adjustment technique is developed by Wang 
and Rao [22] by using an approximate result in Rao and Scott [19]. Note that $\hat\rho(\beta)$ can 
be written as 

$$\hat\rho(\beta) = \frac{\mathrm{tr}\{\hat A^{-}(\beta)\hat A(\beta)\}}{\mathrm{tr}\{\hat B^{-1}(\beta)\hat A(\beta)\}}.$$

By examining the asymptotic expansion of $-2\log\hat L(\beta)$, which is specified in the proof of 
Theorem 4 below, we define an adjustment factor 

$$\hat r(\beta) = \frac{\mathrm{tr}\{\hat A^{-}(\beta)\hat\Sigma(\beta)\}}{\mathrm{tr}\{\hat B^{-1}(\beta)\hat\Sigma(\beta)\}}$$

by replacing $\hat A(\beta)$ in $\hat\rho(\beta)$ by $\hat\Sigma(\beta)$, where $\hat\Sigma(\beta) = \{\sum_{i=1}^n \hat\eta_i(\beta)\}\{\sum_{i=1}^n \hat\eta_i(\beta)\}^T$. The 
adjusted empirical log-likelihood ratio is defined by 

$$\hat l_{\mathrm{ael}}(\beta) = \hat r(\beta)\{-2\log\hat L(\beta)\}, \qquad (3.1)$$

where $\log\hat L(\beta)$ is defined in (2.2). 



Theorem 4. Suppose that conditions (C1)-(C6) hold. Then $\hat l_{\mathrm{ael}}(\beta_0) \xrightarrow{D} \chi_p^2$. 

According to Theorem 4, $\hat l_{\mathrm{ael}}(\beta)$ can be used to construct an approximate confidence 
region for $\beta_0$. Let 

$$\mathcal{R}_{\mathrm{ael}}(\alpha) = \{\beta \in B: \hat l_{\mathrm{ael}}(\beta) \le \chi_p^2(1 - \alpha)\}.$$

Then, $\mathcal{R}_{\mathrm{ael}}(\alpha)$ gives a confidence region for $\beta_0$ with asymptotically correct coverage 
probability $1 - \alpha$. 
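
The two adjustment factors of this section reduce to a few trace computations once the matrix estimates are available. The sketch below assumes that `A_hat` and `B_hat` are already computed $p \times p$ arrays and that `eta_hat` stacks the vectors $\hat\eta_i(\beta)$ as rows; the function names are hypothetical, and the generalized inverse is taken to be the Moore-Penrose inverse.

```python
import numpy as np


def rao_scott_factor(A_hat, B_hat):
    """rho = p / tr(B^{-1} A), the Rao-Scott type scale factor."""
    p = A_hat.shape[0]
    return p / np.trace(np.linalg.solve(B_hat, A_hat))


def ael_factor(A_hat, B_hat, eta_hat):
    """r = tr(A^- Sigma) / tr(B^{-1} Sigma) with Sigma = s s^T, s = sum_i eta_i.

    The Moore-Penrose inverse stands in for the generalized inverse A^-
    (an assumption of this sketch).
    """
    s = eta_hat.sum(axis=0)
    Sigma = np.outer(s, s)
    num = np.trace(np.linalg.pinv(A_hat) @ Sigma)
    den = np.trace(np.linalg.solve(B_hat, Sigma))
    return num / den
```

Multiplying either factor by $-2\log\hat L(\beta)$ gives the corresponding adjusted statistic, which is then compared with the $\chi_p^2$ quantile. When `A_hat` equals `B_hat`, both factors equal one and no adjustment takes place.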

4. Numerical results 
4.1. Bandwidth selection 

Various existing bandwidth selection techniques for nonparametric regression, such as 
cross-validation and generalized cross-validation, can be adapted for the estimation of $g_0(\cdot)$. 
In our simulation, however, we use the modified multi-fold cross-validation (MMCV) criterion 
proposed by Cai, Fan and Yao [3] to select the optimal bandwidth because the algorithm 
is simple and quick. Let $m$ and $Q$ be two given positive integers with $n > mQ$. The 
basic idea is first to use $Q$ sub-series of lengths $n - km$ ($k = 1, \ldots, Q$) to estimate the 
unknown coefficient functions and then to compute the one-step forecasting error of the 
next section of the sample of length $m$ based on the estimated models. More precisely, 
we choose $h$ which minimizes 

$$\mathrm{AMS}(h) = \sum_{k=1}^Q \mathrm{AMS}_k(h), \qquad (4.1)$$

where, for $k = 1, \ldots, Q$, 

$$\mathrm{AMS}_k(h) = \frac{1}{m}\sum_{i=n-km+1}^{n-km+m} \Bigg\{Y_i - \sum_{j=1}^q \hat g_{j,k}(U_i) Z_{ij}\Bigg\}^2$$

and $\{\hat g_{j,k}(\cdot)\}$ are computed from the sample $\{(Y_i, U_i, Z_i), 1 \le i \le n - km\}$ with bandwidth 
equal to $h\{n/(n - km)\}^{1/5}$. Note that for different sample sizes, we re-scale the bandwidth according 
to its optimal rate, that is, $h \propto n^{-1/5}$. Since the selected bandwidth does not depend 
critically on the choice of $m$ and $Q$, for computational expediency we take $m = [0.1n]$ and 
$Q = 4$ in our simulation. 
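
The criterion (4.1) can be sketched as follows, under simplifying assumptions that are not part of the text: the index values $U_i$ are treated as known, the coefficient functions are fitted by a local linear smoother with the Epanechnikov kernel, and the bandwidth is chosen from a finite grid. The names `fit_coefficients`, `ams` and `select_bandwidth` are hypothetical.

```python
import numpy as np


def fit_coefficients(u0, U, Z, Y, h):
    """Local linear estimate of the coefficient vector at u0 (Epanechnikov weights)."""
    t = (U - u0) / h
    k = 0.75 * np.clip(1.0 - t**2, 0.0, None)
    design = np.hstack([Z, Z * t[:, None]])      # regressors for (a, h * b)
    sw = np.sqrt(k)
    # Weighted least squares via square-root weights.
    coef, *_ = np.linalg.lstsq(design * sw[:, None], Y * sw, rcond=None)
    return coef[: Z.shape[1]]


def ams(h, U, Z, Y, m, Q):
    """Accumulated forecasting error AMS(h) over Q sub-series."""
    n = len(Y)
    total = 0.0
    for k in range(1, Q + 1):
        n_fit = n - k * m
        h_k = h * (n / n_fit) ** 0.2             # bandwidth re-scaled to n_fit
        err = 0.0
        for i in range(n_fit, n_fit + m):        # next block of length m
            g_hat = fit_coefficients(U[i], U[:n_fit], Z[:n_fit], Y[:n_fit], h_k)
            err += (Y[i] - g_hat @ Z[i]) ** 2
        total += err / m
    return total


def select_bandwidth(grid, U, Z, Y, m, Q):
    """Return the grid value that minimises AMS(h)."""
    scores = [ams(h, U, Z, Y, m, Q) for h in grid]
    return grid[int(np.argmin(scores))]
```

A grid search is used here only for transparency; any one-dimensional minimiser over $h > 0$ could replace it.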

Let $\hat h_{\mathrm{opt}}$ be the bandwidth obtained by minimizing (4.1) with respect to $h > 0$; that 
is, $\hat h_{\mathrm{opt}} = \arg\min_{h>0} \mathrm{AMS}(h)$. Then $\hat h_{\mathrm{opt}}$ is the optimal bandwidth for estimating $g_0(\cdot)$. When 
calculating the empirical likelihood ratios and the estimator of $\beta_0$, we use the approximation 
bandwidths 

$$h = \hat h_{\mathrm{opt}}\, n^{-2/15}(\log n)^{-1/2}, \qquad h_1 = \hat h_{\mathrm{opt}},$$

because this ensures that the required bandwidth has the correct order of magnitude for 
the optimal asymptotic performance (see, e.g., Carroll et al. [4]), and the bandwidth $h$ 
satisfies condition (C4). 

4.2. Simulation study 

We now examine the performance of the procedures described in Sections 2 and 3. Consider 
the regression model 

$$Y_i = g_0(\beta_0^T X_i) + g_1(\beta_0^T X_i) Z_{i1} + g_2(\beta_0^T X_i) Z_{i2} + \varepsilon_i, \qquad (4.2)$$

where $\beta_0 = (1/\sqrt{5}, 2/\sqrt{5})^T$ and the $\varepsilon_i$'s are independent $N(0, 0.8^2)$ random variables. 
The sample $\{X_i = (X_{i1}, X_{i2})^T; 1 \le i \le n\}$ was generated from a bivariate uniform 
distribution on $[-1, 1]^2$ with independent components, and $\{Z_i = (Z_{i1}, Z_{i2})^T; 1 \le i \le n\}$ was 
generated from a bivariate normal distribution $N(0, \Sigma)$ with $\mathrm{var}(Z_{i1}) = \mathrm{var}(Z_{i2}) = 1$ and 
correlation coefficient $\rho = 0.6$ between $Z_{i1}$ and $Z_{i2}$. In model (4.2), the coefficient 
functions are $g_0(u) = 12\exp(-2u^2)$, $g_1(u) = 10u^2$ and $g_2(u) = 16\sin(\pi u)$. 

For the smoother, we used a local linear smoother with an Epanechnikov kernel $K(u) = 
0.75(1 - u^2)_+$ and an MMCV bandwidth throughout all smoothing steps. We take the 
weight function $w(u) = I_{[-3/\sqrt{5},\, 3/\sqrt{5}]}(u)$. The sample size for the simulated data is 100, 
and each simulation is repeated 500 times. 
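
One replicate of the simulated data described above can be generated as in the following sketch. The function name `simulate` and the seed handling are illustrative assumptions; the distributions and coefficient functions follow the stated design.

```python
import numpy as np


def simulate(n=100, seed=0):
    """Generate one sample (Y, X, Z) from the simulation design, plus beta0."""
    rng = np.random.default_rng(seed)
    beta0 = np.array([1.0, 2.0]) / np.sqrt(5.0)
    X = rng.uniform(-1.0, 1.0, size=(n, 2))            # independent U[-1, 1] components
    cov = np.array([[1.0, 0.6], [0.6, 1.0]])           # unit variances, correlation 0.6
    Z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    eps = rng.normal(0.0, 0.8, size=n)                 # N(0, 0.8^2) errors
    u = X @ beta0                                      # single index beta0^T X_i
    Y = (12.0 * np.exp(-2.0 * u**2)
         + 10.0 * u**2 * Z[:, 0]
         + 16.0 * np.sin(np.pi * u) * Z[:, 1]
         + eps)
    return Y, X, Z, beta0
```

Since each component of $X$ lies in $[-1, 1]$, the index $\beta_0^T X$ always falls in $[-3/\sqrt{5}, 3/\sqrt{5}]$, which is the range covered by the weight function.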

The confidence regions for $\beta_0$ and their coverage probabilities, with nominal level 
$1 - \alpha = 0.95$, were computed from 500 runs. Four methods were used to construct the 
confidence regions: the estimated empirical likelihood (EEL) with a conditional approximation, 
the adjusted empirical likelihood (AEL), the improved Rao-Scott adjusted empirical 
likelihood (IRSAEL) and the normal approximation (NA). A comparison among 
the four methods was made through coverage accuracies and coverage areas of the confidence 
regions. The simulated results are given in Figure 1. 

From Figure 1 we can see that EEL, AEL and IRSAEL give smaller confidence regions 
than NA, and the region obtained by AEL is much smaller than the others. Thus, in terms of 
region size, AEL performs best among the four methods. 

The histograms of the 500 estimators of the parameters $\beta_1$ and $\beta_2$ are in Figures 2(a) 
and (b), respectively. The Q-Q plots of the 500 estimators of the parameters $\beta_1$ and $\beta_2$ 
are in Figures 2(c) and (d), respectively. 

Figure 2 suggests empirically that these estimators are approximately normally distributed. The means 
of the estimates of the unknown parameters $\beta_1$ and $\beta_2$ are 0.44734 and 0.89502, 
respectively, and their biases (standard deviations) are 0.000131 (0.00302) and 0.000596 
(0.00257), respectively. 

We also consider the average estimates of the coefficient functions $g_0(u)$, $g_1(u)$ and 
$g_2(u)$ over the 500 replicates. The estimators $\hat g_j(\cdot)$ are assessed via the root mean squared 
errors (RMSE); that is, $\mathrm{RMSE} = \sum_{j=0}^2 \mathrm{RMSE}_j$, where 

$$\mathrm{RMSE}_j = \Bigg[n_{\mathrm{grid}}^{-1}\sum_{k=1}^{n_{\mathrm{grid}}} \{\hat g_j(u_k) - g_j(u_k)\}^2\Bigg]^{1/2}$$

and $\{u_k, k = 1, \ldots, n_{\mathrm{grid}}\}$ are regular grid points. The boxplot for the 500 RMSEs is given 
in Figure 3. 
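
The grid-based error measure can be sketched as below. The function names and the choice of grid are assumptions for illustration; the fitted functions are passed as callables that accept an array of grid points.

```python
import numpy as np


def rmse_on_grid(g_hat, g_true, grid):
    """Root mean squared error of g_hat against g_true over the grid points."""
    diff = g_hat(grid) - g_true(grid)
    return float(np.sqrt(np.mean(diff**2)))


def total_rmse(estimates, truths, grid):
    """Sum of the component-wise RMSEs, one per coefficient function."""
    return sum(rmse_on_grid(gh, gt, grid) for gh, gt in zip(estimates, truths))


if __name__ == "__main__":
    grid = np.linspace(-1.0, 1.0, 101)   # hypothetical regular grid
    g1 = lambda u: 10.0 * u**2
    # An estimate that is off by a constant 0.5 has RMSE 0.5 on any grid.
    print(rmse_on_grid(lambda u: g1(u) + 0.5, g1, grid))
```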
Figure 1. Averages of 95% confidence regions of $(\beta_1, \beta_2)$, based on EEL (solid curve), AEL 
(dashed curve), IRSAEL (dotted curve) and NA (dot-dashed curve) when $n = 100$. 

Figure 2. (a) for $\beta_1$ and (b) for $\beta_2$: the histograms of the 500 estimators of each parameter, 
the estimated density curve (solid curve) and the normal density curve (dashed curve); 
(c) for $\beta_1$ and (d) for $\beta_2$: the normal Q-Q plots of the 500 estimators of each parameter. 

Figure 3. The true curve (solid curve) and the estimated curve (dashed curve): (a) for $g_0(\cdot)$, (b) 
for $g_1(\cdot)$, (c) for $g_2(\cdot)$; (d) the boxplots of the 500 RMSE values in the estimation of $g_0(\cdot)$, $g_1(\cdot)$, 
$g_2(\cdot)$ and the sum of the three RMSEs. 

From Figures 3(a)-(c) we see that every estimated curve agrees with the true function curve 
very closely. Figure 3(d) shows that all RMSEs of the estimates for the unknown functions 
are very small. 

Appendices 

We divide the appendices into Appendix A and Appendix B. The proofs of Theorems 1-4 
are presented in Appendix A, and the proofs of Lemmas 2 and 3 are presented in Ap- 
pendix B. We use c to represent any positive constant which may take a different value 
for each appearance. 

Appendix A: Proofs of theorems 

The following lemma gives the uniform convergence rates of $\hat g(u; \beta)$ and $\hat{\dot g}(u; \beta)$. This lemma 
is a straightforward extension of known results in nonparametric function estimation; for 
its proof, the reader may refer to Theorem 2 in Wang and Xue [23], and we hence omit it. 

Lemma 1. Suppose that conditions (C1)-(C3), (C5) and (C6) hold. Then 

$$\sup_{u \in \mathcal{U}_w,\, \beta \in B_n} \|\hat g(u; \beta) - g_0(u)\| = O_P\big(\{\log n/(nh)\}^{1/2} + h^2\big)$$

and 

$$\sup_{u \in \mathcal{U}_w,\, \beta \in B_n} \|\hat{\dot g}(u; \beta) - \dot g_0(u)\| = O_P\big(\{\log n/(nh^3)\}^{1/2} + h\big).$$

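Denote $\mathcal{G} = \{g: \mathcal{U}_w \times B \to R^q\}$ and $\|g\|_{\mathcal{G}} = \sup_{u \in \mathcal{U}_w,\, \beta \in B_n} \|g(u; \beta)\|$. From Lemma 1, we 
have $\|\hat g - g_0\|_{\mathcal{G}} = o_P(1)$ and $\|\hat{\dot g} - \dot g_0\|_{\mathcal{G}} = o_P(1)$; hence we can assume that $\hat g$ lies in $\mathcal{G}_\delta$ 
with $\delta = \delta_n \to 0$ and $\delta > 0$, where 

$$\mathcal{G}_\delta = \{g \in \mathcal{G}: \|g - g_0\|_{\mathcal{G}} \le \delta,\ \|\dot g - \dot g_0\|_{\mathcal{G}} \le \delta\}. \qquad (A.1)$$

Let $\dot g_0(\beta^T X; \beta) = E\{\dot g_0(\beta_0^T X) \mid \beta^T X\}$ and $g_0(\beta^T X; \beta) = E\{g_0(\beta_0^T X) \mid \beta^T X\}$, 

$$Q(g, \beta) = E[\{Y - g^T(\beta^T X; \beta) Z\}\dot g^T(\beta^T X; \beta) Z X w(\beta^T X)], \qquad (A.2)$$

$$Q_n(g, \beta) = \frac{1}{n}\sum_{i=1}^n \{Y_i - g^T(\beta^T X_i; \beta) Z_i\}\dot g^T(\beta^T X_i; \beta) Z_i X_i w(\beta^T X_i). \qquad (A.3)$$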

The following two lemmas are required for obtaining the proofs of the theorems; their 
proofs can be found in Appendix B. 

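Lemma 2. Suppose that conditions (C1)-(C6) hold. Then 

$$\sup_{(g, \beta) \in \mathcal{G}_\delta \times B_n} \|J_1(g, \beta)\| = o_P(n^{-1/2}), \qquad (A.4)$$

$$\sup_{(g, \beta) \in \mathcal{G}_\delta \times B_n} \|J_2(g, \beta)\| = o_P(n^{-1/2}), \qquad (A.5)$$

$$\sup_{(g, \beta) \in \mathcal{G}_\delta \times B_n} \|J_3(g, \beta)\| = o(n^{-1/2}), \qquad (A.6)$$

$$\sqrt{n} J_4(\hat g, \beta_0) \xrightarrow{D} N(0, \sigma^2 A(\beta_0)), \qquad (A.7)$$

where $A(\beta_0)$ is defined in (2.6), 

$$J_1(g, \beta) = Q_n(g, \beta) - Q(g, \beta) - Q_n(g_0, \beta_0),$$

$$J_2(g, \beta) = Q(g, \beta) - Q(g_0, \beta) - \varpi(g_0(\beta^T X; \beta), \beta)\{g(\beta^T X; \beta) - g_0(\beta^T X; \beta)\},$$

$$J_3(g, \beta) = \varpi(g_0(\beta^T X; \beta), \beta)\{g(\beta^T X; \beta) - g_0(\beta^T X; \beta)\} - \varpi(g_0(\beta_0^T X), \beta_0)\{g(\beta_0^T X; \beta_0) - g_0(\beta_0^T X)\}$$

and 

$$J_4(g, \beta_0) = Q_n(g_0, \beta_0) + \varpi(g_0(\beta_0^T X), \beta_0)\{g(\beta_0^T X; \beta_0) - g_0(\beta_0^T X)\}.$$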
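Lemma 3. Suppose that conditions (C1)-(C6) hold. Then 

$$\sup_{\beta \in B_n} \|Q_n(\hat g, \beta)\| = O_P(n^{-1/2}), \qquad (A.8)$$

$$\sup_{\beta \in B_n} \|R_n(\beta) - \sigma^2 B(\beta_0)\| = o_P(1), \qquad (A.9)$$

$$\sup_{\beta \in B_n} \max_{1 \le i \le n} \|\hat\eta_i(\beta)\| = o_P(n^{1/2}), \qquad (A.10)$$

$$\sup_{\beta \in B_n} \|\lambda(\beta)\| = O_P(n^{-1/2}), \qquad (A.11)$$

where $Q_n(g, \beta)$ is defined in (A.3), $R_n(\beta) = n^{-1}\sum_{i=1}^n \hat\eta_i(\beta)\hat\eta_i^T(\beta)$, $B(\beta_0)$ is defined in 
condition (C7) and $\hat\eta_i(\beta)$ is the vector appearing in (2.2). 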

Proof of Theorem 1. Note that, when $\beta = \beta_0$, Lemma 3 also holds. Applying the 
Taylor expansion to (2.2) and invoking Lemma 3, we can obtain 

$$-2\log\hat L(\beta_0) = 2\sum_{i=1}^n \Big[\lambda^T\hat\eta_i(\beta_0) - \tfrac{1}{2}\{\lambda^T\hat\eta_i(\beta_0)\}^2\Big] + o_P(1). \qquad (A.12)$$

By (2.3) and Lemma 3, we have 

$$\sum_{i=1}^n \{\lambda^T\hat\eta_i(\beta_0)\}^2 = \sum_{i=1}^n \lambda^T\hat\eta_i(\beta_0) + o_P(1)$$

and 

$$\lambda = \Bigg\{\sum_{i=1}^n \hat\eta_i(\beta_0)\hat\eta_i^T(\beta_0)\Bigg\}^{-1}\sum_{i=1}^n \hat\eta_i(\beta_0) + o_P(n^{-1/2}).$$

This together with (A.12) proves that 

$$-2\log\hat L(\beta_0) = nQ_n^T(\hat g, \beta_0) R_n^{-1}(\beta_0) Q_n(\hat g, \beta_0) + o_P(1), \qquad (A.13)$$

where $Q_n(g, \beta_0)$ and $R_n(\beta_0)$ are defined in (A.3) and (A.9), respectively. From (A.9) of 
Lemma 3 and (A.13), we obtain 

$$-2\log\hat L(\beta_0) = \{(\sigma^2 A(\beta_0))^{-1/2}\sqrt{n}Q_n(\hat g, \beta_0)\}^T G(\beta_0)\{(\sigma^2 A(\beta_0))^{-1/2}\sqrt{n}Q_n(\hat g, \beta_0)\} + o_P(1), \qquad (A.14)$$

where $G(\beta_0) = A^{1/2}(\beta_0) B^{-1}(\beta_0) A^{1/2}(\beta_0)$. Let $G_0 = \mathrm{diag}(w_1, \ldots, w_p)$, where $w_i$, $1 \le i \le p$, 
are the eigenvalues of $G(\beta_0)$. Then there exists an orthogonal matrix $H$ such that 
$H^T G_0 H = G(\beta_0)$. Using the notations of Lemma 2, we have 

$$Q_n(\hat g, \beta) = J_1(\hat g, \beta) + J_2(\hat g, \beta) + J_3(\hat g, \beta) + J_4(\hat g, \beta_0) + Q(g_0, \beta). \qquad (A.15)$$

Noting that $Q(g_0, \beta_0) = 0$, from the above equation and Lemma 2, we have 

$$Q_n(\hat g, \beta_0) = J_4(\hat g, \beta_0) + o_P(n^{-1/2}).$$

Hence, by (A.7) of Lemma 2, we have 

$$H\{\sigma^{-2}A^{-}(\beta_0)\}^{1/2}\sqrt{n}Q_n(\hat g, \beta_0) \xrightarrow{D} N(0, I_p),$$

where $I_p$ is the $p \times p$ identity matrix. This together with (A.14) proves Theorem 1. □ 

Proof of Theorem 2. Under the conditions of Theorem 2, we can follow similar arguments 
to those used by Wang and Xue [23] and show that $\hat\beta$ is a root-n consistent 
estimator of $\beta_0$. Because the proof is straightforward, we do not present it here. We next 
demonstrate the asymptotic normality of $\hat\beta$. By Lemma 3 and, similarly to the proof 
of (A.13), we can obtain 

$$-2\log\hat L(\beta) = nQ_n^T(\hat g, \beta)\{\sigma^2 B(\beta_0)\}^{-1}Q_n(\hat g, \beta) + o_P(1), \qquad (A.16)$$

uniformly for $\beta \in B_n$, where $o_P(1)$ tends to 0 in probability uniformly for $\beta \in B_n$. Since 
the estimator $\hat\beta$ is a maximum of $\log\hat L(\beta)$, and $B(\beta_0)$ is a positive definite matrix, the 
resulting estimator $\hat\beta$ is equivalent to the solution of the estimating equation $Q_n(\hat g, \beta) = 0$; that 
is, $Q_n(\hat g, \hat\beta) = 0$. Note that $Q(g_0, \beta_0) = 0$, and we then have, by Taylor's expansion, that 

$$Q(g_0, \beta) = -B_*(\beta_0)(\beta - \beta_0) + o(n^{-1/2}), \qquad (A.17)$$

uniformly for $\beta \in B_n$, where $B_*(\beta_0)$ is defined in condition (C7). By (A.15), (A.17) 
and (A.4)-(A.6) of Lemma 2, we have 

$$Q_n(\hat g, \beta) = J_4(\hat g, \beta_0) - B_*(\beta_0)(\beta - \beta_0) + o_P(n^{-1/2}).$$

Noting that $Q_n(\hat g, \hat\beta) = 0$, we get 

$$\sqrt{n}(\hat\beta - \beta_0) = \sqrt{n}B_*^{-1}(\beta_0) J_4(\hat g, \beta_0) + o_P(1).$$

This together with (A.7) of Lemma 2 proves Theorem 2. □ 

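Proof of Theorem 3. Decomposing $\hat\sigma^2$ into several parts, we get 

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n \varepsilon_i^2 + \frac{1}{n}\sum_{i=1}^n \big[\{g_0(X_i^T\beta_0) - \hat g(X_i^T\hat\beta; \hat\beta)\}^T Z_i\big]^2 + \frac{2}{n}\sum_{i=1}^n \varepsilon_i\{g_0(X_i^T\beta_0) - \hat g(X_i^T\hat\beta; \hat\beta)\}^T Z_i = I_1 + I_2 + I_3.$$

Using the central limit theorem, we have 

$$\sqrt{n}(I_1 - \sigma^2) = \frac{1}{\sqrt{n}}\sum_{i=1}^n (\varepsilon_i^2 - \sigma^2) \xrightarrow{D} N(0, \mathrm{var}(\varepsilon^2)).$$

By Lemma 1, we can obtain 

$$|I_2| \le \frac{1}{n}\sum_{i=1}^n \|Z_i\|^2 \Big\{\sup_{u \in \mathcal{U}_w,\, \beta \in B_n} \|\hat g(u; \beta) - g_0(u)\|\Big\}^2 = o_P(n^{-1/2}).$$

For $I_3$, we have 

$$I_3 = \frac{2}{n}\sum_{i=1}^n \varepsilon_i\{g_0(X_i^T\beta_0) - \hat g(X_i^T\beta_0; \beta_0)\}^T Z_i + \frac{2}{n}\sum_{i=1}^n \varepsilon_i\{\hat g(X_i^T\beta_0; \beta_0) - \hat g(X_i^T\hat\beta; \hat\beta)\}^T Z_i = I_{31} + I_{32}.$$

It is not hard to show that $I_{31} = O_P(n^{-1/2})$. By Lemma 1 and Theorem 2, we obtain 

$$|I_{32}| \le \frac{1}{n}\sum_{i=1}^n \big(\|Z_i\|\,|\varepsilon_i|\,\|X_i - E(X_i \mid \beta^T X_i)\|\big)\|\hat\beta - \beta_0\|\,O_P(1) = O_P(n^{-1/2}).$$

This together with the above results proves Theorem 3. □ 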

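Proof of Theorem 4. Note that $\hat A(\beta_0) \xrightarrow{P} A(\beta_0)$ and $\hat B(\beta_0) \xrightarrow{P} B(\beta_0)$. By the expansion 
of $\hat l_{\mathrm{ael}}(\beta_0)$, defined in (3.1), and (A.16), we get 

$$\hat l_{\mathrm{ael}}(\beta_0) = nQ_n^T(\hat g, \beta_0)\{\sigma^{-2}A^{-}(\beta_0)\}Q_n(\hat g, \beta_0) + o_P(1). \qquad (A.18)$$

Combining (A.18) with (A.15) and (A.7) of Lemma 2 proves Theorem 4. □ 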

Appendix B: Proofs of lemmas 

Proof of Lemma 2. We first prove (A.4). Denote $r_n(g, \beta) = \sqrt{n}\{Q_n(g, \beta) - Q(g, \beta)\}$. 
Noting that $Q(g_0, \beta_0) = 0$, we clearly have 

$$J_1(g, \beta) = n^{-1/2}\{r_n(g, \beta) - r_n(g_0, \beta_0)\}. \qquad (B.1)$$

It can be shown that the empirical process $\{r_n(g, \beta): g \in \mathcal{G}_1, \beta \in B_1\}$ is stochastically 
equicontinuous, where $B_1 = \{\beta \in B: \|\beta - \beta_0\| \le 1\}$ and $\mathcal{G}_1$ is defined in (A.1) with 
$\delta = 1$; these are subsets of $B$ and $\mathcal{G}$, respectively. The equicontinuity is sufficient for the 
proof of (A.4) since $\delta < 1$ for large enough $n$. This stochastic equicontinuity follows 
by checking the conditions of Theorem 1 in Doukhan, Massart and Rio [7]. Therefore, 
we have $r_n(g, \beta) - r_n(g_0, \beta_0) = o_P(1)$, uniformly for $\beta \in B_1$ and $g \in \mathcal{G}_1$. This together 
with (B.1) proves (A.4). 

We now prove (A.5). Define the functional derivative $\varpi(g_0(\cdot; \beta), \beta)$ of $Q(g, \beta)$ with 
respect to $g(\cdot; \beta)$ at $g_0(\cdot; \beta)$ in the direction $g(\cdot; \beta) - g_0(\cdot; \beta)$ by 

$$\varpi(g_0(\cdot; \beta), \beta)\{g(\cdot; \beta) - g_0(\cdot; \beta)\} = \lim_{\tau \to 0}\frac{1}{\tau}\big[Q(g_0(\cdot; \beta) + \tau(g(\cdot; \beta) - g_0(\cdot; \beta)), \beta) - Q(g_0(\cdot; \beta), \beta)\big], \qquad (B.2)$$

where $Q(g, \beta)$ is defined in (A.2). We have 

$$\varpi(g_0(\beta^T X; \beta), \beta)\{g(\beta^T X; \beta) - g_0(\beta^T X; \beta)\} = -E[\{g(\beta^T X; \beta) - g_0(\beta^T X; \beta)\}^T Z\,\dot g_0^T(\beta^T X; \beta) Z X w(\beta^T X)].$$

It follows from (B.2) that 

$$J_2(g, \beta) = -E[\{g(\beta^T X; \beta) - g_0(\beta^T X; \beta)\}^T Z X Z^T\{\dot g(\beta^T X; \beta) - \dot g_0(\beta^T X; \beta)\}w(\beta^T X)],$$

and hence we have 

$$\omega^T J_2(g, \beta) = -\int \{g(u; \beta) - g_0(u)\}^T\mu_\omega(u)\{\dot g(u; \beta) - \dot g_0(u)\}w(u)f(u)\,du + o_P(n^{-1/2}) \qquad (B.3)$$

for any $p$-dimensional vector $\omega$, where $\mu_\omega(u) = E\{Z\omega^T X Z^T \mid \beta^T X = u\}$, and $f(u)$ is the 
probability density of $\beta^T X$. Using the standard argument of nonparametric estimation, 
we can prove 

$$\hat g(u; \beta) - g_0(u) = D^{-1}(u)\{f(u)\}^{-1}\xi_n(u; \beta) + o_P(c_n), \qquad (B.4)$$

uniformly for $u \in \mathcal{U}_w$ and $\beta \in B_n$, where $c_n = n^{-1/2} + h^2$, $D(u)$ is defined in condition (C6) and 

$$\xi_n(u; \beta) = \frac{1}{n}\sum_{i=1}^n Z_i\{Y_i - g_0^T(\beta^T X_i) Z_i\}K_h(\beta^T X_i - u).$$

This together with (B.3) yields 

$$\omega^T J_2(\hat g, \beta) = -\int \{D^{-1}(u)\xi_n(u; \beta)\}^T\mu_\omega(u)\{\hat{\dot g}(u; \beta) - \dot g_0(u)\}w(u)\,du + o_P(c_n) = -n^{-1/2}\{\gamma_n(\hat g, \beta) - \gamma_n(g_0, \beta)\} + o_P(c_n),$$

where $\gamma_n(g, \beta) = n^{-1/2}\sum_{i=1}^n \varepsilon_i Z_i^T D^{-1}(\beta^T X_i)\mu_\omega(\beta^T X_i)\dot g(\beta^T X_i; \beta)w(\beta^T X_i)$. Using 
empirical process techniques, and similarly to the proof of (A.4), we can show 
the stochastic equicontinuity of $\gamma_n(g, \beta)$, and hence $\|\gamma_n(\hat g, \beta) - \gamma_n(g_0, \beta)\| = o_P(1)$. Also, 
$nh^4 = O(1)$ implies $h^2 = O(n^{-1/2})$, and hence $c_n = O(n^{-1/2})$. Thus, the proof of (A.5) is 
complete. 

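We now prove (A.6). Denote $\psi(g_0, \beta) = \dot g_0^T(\beta^T X; \beta) Z X w(\beta^T X)$ and $\varphi(g, \beta) = 
\{g(\beta^T X; \beta) - g_0(\beta^T X; \beta)\}^T Z$. It follows from (B.2) that 

$$J_3(g, \beta) = -E\{\varphi(g, \beta)\psi(g_0, \beta)\} + E\{\varphi(g, \beta_0)\psi(g_0, \beta_0)\}$$
$$= -E[\{\varphi(g, \beta) - \varphi(g, \beta_0)\}\psi(g_0, \beta)] - E[\varphi(g, \beta_0)\{\psi(g_0, \beta) - \psi(g_0, \beta_0)\}]$$
$$= J_{31}(g, \beta) + J_{32}(g, \beta).$$

By condition (C2), we get 

$$\|\varphi(g, \beta) - \varphi(g, \beta_0)\| = \|[\{g(\beta^T X; \beta) - g(\beta_0^T X; \beta_0)\} - \{g_0(\beta^T X; \beta) - g_0(\beta_0^T X)\}]^T Z\|$$
$$= \|[\{\dot g(\beta_1^T X; \beta_1) - \dot g_0(\beta_2^T X)\}(\beta - \beta_0)^T\{X - E(X \mid \beta^T X)\}]^T Z\|$$
$$\le c\|\dot g - \dot g_0\|_{\mathcal{G}}\|\beta - \beta_0\|(\|X - E(X \mid \beta^T X)\|)(\|Z\|),$$

where $\beta_1$ and $\beta_2$ are between $\beta$ and $\beta_0$, and $\|\psi(g_0, \beta)\| \le c(\|Z\|)(\|X\|)$. Therefore, we 
have $\|J_{31}(g, \beta)\| = o(n^{-1/2})$, uniformly for $g \in \mathcal{G}_\delta$ and $\beta \in B_n$. Similarly, we can prove 
$\|J_{32}(g, \beta)\| = o(n^{-1/2})$, uniformly for $g \in \mathcal{G}_\delta$ and $\beta \in B_n$, and hence (A.6) follows. 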

Finally, we prove (A.7). Let $f_0(u)$ denote the density function of $\beta_0^T X$. By (B.2) 
and (B.4), and using the dominated convergence theorem (Loève [14]), we can obtain 

$$\varpi(g_0(\beta_0^T X), \beta_0)\{\hat g(\beta_0^T X; \beta_0) - g_0(\beta_0^T X)\} = -\int C(u)\{\hat g(u; \beta_0) - g_0(u)\}f_0(u)\,du = -\frac{1}{n}\sum_{i=1}^n \varepsilon_i C(\beta_0^T X_i) D^{-1}(\beta_0^T X_i) Z_i + o_P(c_n).$$

This together with (A.3) proves that 

$$J_4(\hat g, \beta_0) = \frac{1}{n}\sum_{i=1}^n \varepsilon_i\zeta_i + o_P(c_n),$$

where $\zeta_i = V_i - C(\beta_0^T X_i) D^{-1}(\beta_0^T X_i) Z_i$ and $V_i = X_i\dot g_0^T(\beta_0^T X_i) Z_i w(\beta_0^T X_i)$. Therefore, by 
the central limit theorem and Slutsky's theorem, we get 

$$\sqrt{n}J_4(\hat g, \beta_0) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i\zeta_i + o_P(1) \xrightarrow{D} N(0, \sigma^2 A(\beta_0)).$$

This proves (A.7). The proof of Lemma 2 is complete. □ 



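Proof of Lemma 3. By (A.15), (A.17) and Lemma 2, we can prove (A.8). We now 
prove (A.9). Let 

$$R_{ni}(\beta) = \varepsilon_i\dot g_0^T(\beta_0^T X_i) Z_i X_i\{w(\beta^T X_i) - w(\beta_0^T X_i)\}$$
$$+ \varepsilon_i\{\hat{\dot g}(\beta^T X_i; \beta) - \dot g_0(\beta_0^T X_i)\}^T Z_i X_i w(\beta^T X_i)$$
$$+ \{g_0(\beta_0^T X_i) - \hat g(\beta^T X_i; \beta)\}^T Z_i Z_i^T\dot g_0(\beta_0^T X_i) X_i w(\beta^T X_i)$$
$$+ \{g_0(\beta_0^T X_i) - \hat g(\beta^T X_i; \beta)\}^T Z_i Z_i^T\{\hat{\dot g}(\beta^T X_i; \beta) - \dot g_0(\beta_0^T X_i)\}X_i w(\beta^T X_i).$$

Then we have $\hat\eta_i(\beta) = \eta_i(\beta_0) + R_{ni}(\beta)$, where $\eta_i(\cdot)$ is defined in (2.1), and hence 

$$R_n(\beta) = \frac{1}{n}\sum_{i=1}^n \eta_i(\beta_0)\eta_i^T(\beta_0) + \frac{1}{n}\sum_{i=1}^n R_{ni}(\beta)R_{ni}^T(\beta) + \frac{1}{n}\sum_{i=1}^n \eta_i(\beta_0)R_{ni}^T(\beta) + \frac{1}{n}\sum_{i=1}^n R_{ni}(\beta)\eta_i^T(\beta_0)$$
$$= M_1(\beta_0) + M_2(\beta) + M_3(\beta) + M_4(\beta). \qquad (B.5)$$

By the law of large numbers, we have $M_1(\beta_0) \xrightarrow{P} \sigma^2 B(\beta_0)$. Therefore, to prove (A.9), 
we only need to show that $M_k(\beta) \xrightarrow{P} 0$ uniformly for $\beta \in B_n$, $k = 2, 3, 4$. 

Let $M_{2,st}(\beta)$ denote the $(s, t)$ element of $M_2(\beta)$, and $R_{ni,s}(\beta)$ denote the $s$th component 
of $R_{ni}(\beta)$. Then by the Cauchy-Schwarz inequality, we have 

$$|M_{2,st}(\beta)| \le \Bigg\{\frac{1}{n}\sum_{i=1}^n R_{ni,s}^2(\beta)\Bigg\}^{1/2}\Bigg\{\frac{1}{n}\sum_{i=1}^n R_{ni,t}^2(\beta)\Bigg\}^{1/2}. \qquad (B.6)$$

It can be shown by a direct calculation that 

$$\frac{1}{n}\sum_{i=1}^n R_{ni,s}^2(\beta) \xrightarrow{P} 0,$$

uniformly for $\beta \in B_n$. This together with (B.6) proves that $M_2(\beta) \xrightarrow{P} 0$, uniformly for 
$\beta \in B_n$. Similarly, it can be shown that $M_3(\beta) \xrightarrow{P} 0$ and $M_4(\beta) \xrightarrow{P} 0$, uniformly for 
$\beta \in B_n$. This together with (B.5) proves (A.9). 

Similarly to the above proof, we can derive (A.10). (A.11) can be shown by using (A.8)-(A.10) 
and employing the same arguments used in the proof of (2.14) in Owen [16]. □ 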